AITopics | hyperparameter optimisation

Improving Predictions of Molecular Properties with Graph Featurisation and Heterogeneous Ensemble Models

Parker, Michael L., Mahmoud, Samar, Montefiore, Bailey, Öeren, Mario, Tandon, Himani, Wharrick, Charlotte, Segall, Matthew D.

arXiv.org Artificial Intelligence, Oct-28-2025

We explore a "best-of-both" approach to modelling molecular properties by combining learned molecular descriptors from a graph neural network (GNN) with general-purpose descriptors and a mixed ensemble of machine learning (ML) models. We introduce a MetaModel framework to aggregate predictions from a diverse set of leading ML models. We present a featurisation scheme for combining task-specific GNN-derived features with conventional molecular descriptors. We demonstrate that our framework outperforms the cutting-edge ChemProp model on all regression datasets tested and 6 of 9 classification datasets. We further show that including the GNN features derived from ChemProp boosts the ensemble model's performance on several datasets where it otherwise would have underperformed. We conclude that to achieve optimal performance across a wide set of problems, it is vital to combine general-purpose descriptors with task-specific learned features and use a diverse set of ML models to make the predictions.

artificial intelligence, deep learning, machine learning, (19 more...)

doi: 10.1021/acs.jcim.5c01844

2510.23428

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Helix 1.0: An Open-Source Framework for Reproducible and Interpretable Machine Learning on Tabular Scientific Data

Aguilar-Bejarano, Eduardo, Lea, Daniel, Sivakumar, Karthikeyan, Mase, Jimiama M., Omidvar, Reza, Li, Ruizhe, Kettle, Troy, Mitchell-White, James, Alexander, Morgan R, Winkler, David A, Figueredo, Grazziela

arXiv.org Artificial Intelligence, Jul-25-2025

The massive increase in data in scientific research requires the development and application of robust tools for data analysis and machine learning (ML) that are findable, accessible, interoperable, reusable (FAIR) and interpretable. In domains such as biomaterials science, engineering, chemistry, healthcare and biosciences, data-driven discovery typically requires interdisciplinary teams. These teams collaborate to implement unbiased data pre-processing strategies, select appropriate modelling techniques, and interpret model outputs to accelerate and inform research outcomes and support rational design and decision-making. This process is often iterative, with experts providing feedback over long periods of time to refine models and optimise the methodology adopted. In cases where initial analysis identifies issues with the data, such as outliers, unbalanced data classes, or experimental measurement uncertainty, another round of data collection and pre-processing might be necessary. That means that data for the same problem are likely to be analysed multiple times using different dataset versions and methodological pipelines. For interdisciplinary co-development of analytics, there is also a need for tools that allow domain experts to focus on interpreting and using analysis results, rather than developing code. The widespread use of ML and the overwhelming availability of thousands of community-driven open-source packages in Python and R increases the barrier for interoperable and reusable data analysis methodologies. To facilitate accurate analytics, transparency, and modelling results comparison, there is a strong need for easy-to-use tools that automatically track data, all methodological choices, performance metrics, and corresponding results.

artificial intelligence, data mining, machine learning, (16 more...)

2507.17791

Country: Oceania > Australia (0.14)

Genre: Research Report (0.86)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

Improving Linear System Solvers for Hyperparameter Optimisation in Iterative Gaussian Processes

Neural Information Processing Systems, May-26-2025, 17:52:20 GMT

Scaling hyperparameter optimisation to very large datasets remains an open problem in the Gaussian process community. This paper focuses on iterative methods, which use linear system solvers, like conjugate gradients, alternating projections or stochastic gradient descent, to construct an estimate of the marginal likelihood gradient. We discuss three key improvements which are applicable across solvers: (i) a pathwise gradient estimator, which reduces the required number of solver iterations and amortises the computational cost of making predictions, (ii) warm starting linear system solvers with the solution from the previous step, which leads to faster solver convergence at the cost of negligible bias, (iii) early stopping linear system solvers after a limited computational budget, which synergises with warm starting, allowing solver progress to accumulate over multiple marginal likelihood steps. These techniques provide speed-ups of up to 72× when solving to tolerance, and decrease the average residual norm by up to 7× when stopping early.
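To make the warm-starting idea in this abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the helper names (`rbf_system`, `conjugate_gradients`, `residual_norm`), the toy 1-D inputs, the RBF kernel, and the noise level are all assumptions chosen for illustration, and the pathwise gradient estimator is left out entirely. It solves (K + noise·I)x = b with conjugate gradients at one lengthscale, then reuses that solution as the starting point after a small hyperparameter update. The `max_iters` argument plays the role of a limited solver budget.

```python
import math

def rbf_system(points, lengthscale, noise):
    """Return K + noise*I for a unit-variance RBF kernel on 1-D points (toy setup)."""
    return [
        [math.exp(-((a - b) ** 2) / (2.0 * lengthscale ** 2)) + (noise if i == j else 0.0)
         for j, b in enumerate(points)]
        for i, a in enumerate(points)
    ]

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residual_norm(A, b, x):
    """Euclidean norm of b - A x."""
    return math.sqrt(sum((bi - yi) ** 2 for bi, yi in zip(b, matvec(A, x))))

def conjugate_gradients(A, b, x0=None, tol=1e-8, max_iters=100):
    """Standard CG for a symmetric positive definite A.

    x0 is the initial guess (zeros if omitted); max_iters caps the budget.
    Returns the approximate solution and the number of iterations used.
    """
    x = list(x0) if x0 is not None else [0.0] * len(b)
    r = [bi - yi for bi, yi in zip(b, matvec(A, x))]
    p = list(r)
    rs = dot(r, r)
    iters = 0
    while math.sqrt(rs) > tol and iters < max_iters:
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
        iters += 1
    return x, iters

# Hypothetical toy data: six inputs and an arbitrary right-hand side.
points = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
b = [1.0, -0.5, 0.3, 0.8, -1.2, 0.4]

# Solve at the "previous" hyperparameter value.
A_old = rbf_system(points, lengthscale=1.0, noise=1.0)
x_old, _ = conjugate_gradients(A_old, b)

# After a small lengthscale update, compare the two possible starting points.
A_new = rbf_system(points, lengthscale=1.05, noise=1.0)
cold_start_residual = residual_norm(A_new, b, [0.0] * len(b))
warm_start_residual = residual_norm(A_new, b, x_old)

# Warm-started solve: begins from the previous solution instead of zeros.
x_new, iters_warm = conjugate_gradients(A_new, b, x0=x_old)
print(cold_start_residual, warm_start_residual, iters_warm)
```

When the hyperparameters move only slightly between steps, the previous solution already nearly satisfies the new system, so the solver starts with a much smaller residual. That is why a fixed iteration budget can still make progress that accumulates across steps.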

artificial intelligence, hyperparameter optimisation, machine learning, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.64)

Hyperparameter Optimisation with Practical Interpretability and Explanation Methods in Probabilistic Curriculum Learning

Salt, Llewyn, Gallagher, Marcus

arXiv.org Artificial Intelligence, Apr-10-2025

Hyperparameter optimisation (HPO) is crucial for achieving strong performance in reinforcement learning (RL), as RL algorithms are inherently sensitive to hyperparameter settings. Probabilistic Curriculum Learning (PCL) is a curriculum learning strategy designed to improve RL performance by structuring the agent's learning process, yet effective hyperparameter tuning remains challenging and computationally demanding. In this paper, we provide an empirical analysis of hyperparameter interactions and their effects on the performance of a PCL algorithm within standard RL tasks, including point-maze navigation and DC motor control. Using the AlgOS framework integrated with Optuna's Tree-Structured Parzen Estimator (TPE), we present strategies to refine hyperparameter search spaces, enhancing optimisation efficiency. Additionally, we introduce a novel SHAP-based interpretability approach tailored specifically for analysing hyperparameter impacts, offering clear insights into how individual hyperparameters and their interactions influence RL performance. Our work contributes practical guidelines and interpretability tools that significantly improve the effectiveness and computational feasibility of hyperparameter optimisation in reinforcement learning.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2504.06683

Country: Oceania > Australia > Queensland (0.14)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Machine Learning-based Regional Cooling Demand Prediction with Optimised Dataset Partitioning

Zhang, Meng, Li, Zhihui, Yu, Zhibin

arXiv.org Artificial Intelligence, Mar-4-2025

In the context of global warming, even relatively cooler countries like the UK are experiencing a rise in cooling demand, particularly in southern regions such as London. This growing demand, especially during the summer months, presents significant challenges for energy management systems. Accurately predicting cooling demand in urban domestic buildings is essential for maintaining energy efficiency. This study introduces a generalised framework for developing high-resolution Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks using physical model-based summer cooling demand data. To maximise the predictive capability and generalisation ability of the models under limited data scenarios, four distinct data partitioning strategies were implemented: extrapolation, month-based interpolation, global interpolation, and day-based interpolation. Bayesian Optimisation (BO) was then applied to fine-tune the hyper-parameters, substantially improving the framework's predictive accuracy. Results show that the day-based interpolation GRU model demonstrated the best performance due to its ability to retain both the data randomness and the time sequence continuity characteristics. This optimal model achieves a Root Mean Squared Error (RMSE) of 2.22%, a Mean Absolute Error (MAE) of 0.87%, and a coefficient of determination (R²) of 0.9386 on the test set. The generalisation ability of this framework was further evaluated by forecasting.
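For reference, the three test-set metrics named in this abstract can be computed as in the short sketch below. The function names and the toy numbers are assumptions for illustration only. The abstract reports RMSE and MAE as percentages, which suggests some normalisation that is not specified here, so this sketch returns the plain, unnormalised quantities.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical values, not taken from the paper.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

print(rmse(y_true, y_pred), mae(y_true, y_pred), r_squared(y_true, y_pred))
```

RMSE penalises large individual errors more heavily than MAE, so reporting both gives a rough sense of whether the error is spread evenly or concentrated in a few time steps.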

extrapolation, interpolation, prediction, (17 more...)

2503.05813

Country:

Europe > United Kingdom (1.00)
Asia > China (0.14)
Africa (0.14)

Genre: Research Report > New Finding (0.88)

Industry:

Construction & Engineering (1.00)
Energy > Power Industry (0.88)
Energy > Oil & Gas > Upstream (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Robust and Conjugate Spatio-Temporal Gaussian Processes

Laplante, William, Altamirano, Matias, Duncan, Andrew, Knoblauch, Jeremias, Briol, François-Xavier

arXiv.org Machine Learning, Feb-4-2025

State-space formulations allow for Gaussian process (GP) regression with linear-in-time computational cost in spatio-temporal settings, but performance typically suffers in the presence of outliers. In this paper, we adapt and specialise the robust and conjugate GP (RCGP) framework of Altamirano et al. (2024) to the spatio-temporal setting. In doing so, we obtain an outlier-robust spatio-temporal GP with a computational cost comparable to classical spatio-temporal GPs. We also overcome the three main drawbacks of RCGPs: their unreliable performance when the prior mean is chosen poorly, their lack of reliable uncertainty quantification, and the need to carefully select a hyperparameter by hand. We study our method extensively in finance and weather forecasting applications, demonstrating that it provides a reliable approach to spatio-temporal modelling in the presence of outliers.

artificial intelligence, machine learning, outlier, (17 more...)

2502.0245

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Evaluating Spoken Language as a Biomarker for Automated Screening of Cognitive Impairment

Lima, Maria R., Capstick, Alexander, Geranmayeh, Fatemeh, Nilforooshan, Ramin, Matarić, Maja, Vaidyanathan, Ravi, Barnaghi, Payam

arXiv.org Artificial Intelligence, Jan-30-2025

Timely and accurate assessment of cognitive impairment is a major unmet need in populations at risk. Alterations in speech and language can be early predictors of Alzheimer's disease and related dementias (ADRD) before clinical signs of neurodegeneration. Voice biomarkers offer a scalable and non-invasive solution for automated screening. However, the clinical applicability of machine learning (ML) remains limited by challenges in generalisability, interpretability, and access to patient data to train clinically applicable predictive models. Using DementiaBank recordings (N=291, 64% female), we evaluated ML techniques for ADRD screening and severity prediction from spoken language. We validated model generalisability with pilot data collected in-residence from older adults (N=22, 59% female). Risk stratification and linguistic feature importance analysis enhanced the interpretability and clinical utility of predictions. For ADRD classification, a Random Forest applied to lexical features achieved a mean sensitivity of 69.4% (95% confidence interval (CI) = 66.4-72.5) and specificity of 83.3% (78.0-88.7). On real-world pilot data, this model achieved a mean sensitivity of 70.0% (58.0-82.0) and specificity of 52.5% (39.3-65.7). For severity prediction using Mini-Mental State Examination (MMSE) scores, a Random Forest Regressor achieved a mean absolute MMSE error of 3.7 (3.7-3.8), with comparable performance of 3.3 (3.1-3.5) on pilot data. Linguistic features associated with higher ADRD risk included increased use of pronouns and adverbs, greater disfluency, reduced analytical thinking, lower lexical diversity and fewer words reflecting a psychological state of completion. Our interpretable predictive modelling offers a novel approach for in-home integration with conversational AI to monitor cognitive health and triage higher-risk individuals, enabling earlier detection and intervention.
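The sensitivity and specificity figures quoted in this abstract follow the usual confusion-matrix definitions, sketched below. The function name, the label convention (1 for the positive class, 0 for the negative class), and the example labels are assumptions for illustration and do not come from the study's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 marks the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    sensitivity = tp / (tp + fn)  # share of true positives that were flagged
    specificity = tn / (tn + fp)  # share of true negatives that were cleared
    return sensitivity, specificity

# Hypothetical screening outcomes, not taken from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

For a screening tool the two numbers trade off against each other: a lower specificity, as reported on the pilot data, means more people without the condition would be flagged for follow-up.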

large language model, machine learning, natural language, (19 more...)

2501.18731

Country:

North America > United States > California (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Europe > United Kingdom > England (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(3 more...)

Navigating Public Sentiment in the Circular Economy through Topic Modelling and Hyperparameter Optimisation

Song, Junhao, Yuan, Yingfang, Chang, Kaiwen, Xu, Bing, Xuan, Jin, Pang, Wei

arXiv.org Artificial Intelligence, May-16-2024

To advance the circular economy (CE), it is crucial to gain insights into the evolution of public sentiments, cognitive pathways of the masses concerning circular products and digital technology, and recognise the primary concerns. To achieve this, we collected data related to the CE from diverse platforms including Twitter, Reddit, and The Guardian. This comprehensive data collection spanned across three distinct strata of the public: the general public, professionals, and official sources. Subsequently, we utilised three topic models on the collected data. Topic modelling represents a type of data-driven and machine learning approach for text mining, capable of automatically categorising a large number of documents into distinct semantic groups. Simultaneously, these groups are described by topics, and these topics can aid in understanding the semantic content of documents at a high level. However, the performance of topic modelling may vary depending on different hyperparameter values. Therefore, in this study, we proposed a framework for topic modelling with hyperparameter optimisation for CE and conducted a series of systematic experiments to ensure that topic models are set with appropriate hyperparameters and to gain insights into the correlations between the CE and public opinion based on well-established models. The results of this study indicate that concerns about sustainability and economic impact persist across all three datasets. Official sources demonstrate a higher level of engagement with the application and regulation of CE. To the best of our knowledge, this study is pioneering in investigating various levels of public opinions concerning CE through topic modelling with the exploration of hyperparameter optimisation.

circular economy, dataset, sustainability, (14 more...)

2405.10452

Country:

Oceania > Australia (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Europe > Russia (0.04)
(13 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Water & Waste Management > Solid Waste Management (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Government (1.00)
(6 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.69)
(5 more...)

YAMLE: Yet Another Machine Learning Environment

Ferianc, Martin, Rodrigues, Miguel

arXiv.org Artificial Intelligence, Feb-9-2024

YAMLE: Yet Another Machine Learning Environment is an open-source framework that facilitates rapid prototyping and experimentation with machine learning (ML) models and methods. The key motivation is to reduce repetitive work when implementing new approaches and improve reproducibility in ML research. YAMLE includes a command-line interface and integrations with popular and well-maintained PyTorch-based libraries to streamline training, hyperparameter optimisation, and logging. The ambition for YAMLE is to grow into a shared ecosystem where researchers and practitioners can quickly build on and compare existing implementations.

experiment, hyperparameter optimisation, yamle, (12 more...)

2402.06268

Country: Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.50)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Uncertainty in GNN Learning Evaluations: The Importance of a Consistent Benchmark for Community Detection

Leeney, William, McConville, Ryan

arXiv.org Artificial Intelligence, Nov-25-2023

Graph Neural Networks (GNNs) have improved unsupervised community detection of clustered nodes due to their ability to encode the dual dimensionality of the connectivity and feature information spaces of graphs. Identifying the latent communities has many practical applications from social networks to genomics. Current benchmarks of real-world performance are confusing due to the variety of decisions influencing the evaluation of GNNs at this task. To address this, we propose a framework to establish a common evaluation protocol. We motivate and justify it by demonstrating the differences with and without the protocol. The W Randomness Coefficient is a metric proposed for assessing the consistency of algorithm rankings to quantify the reliability of results under the presence of randomness. We find that by ensuring the same evaluation criteria are followed, there may be significant differences from the reported performance of methods at this task, but a more complete evaluation and comparison of methods is possible.

algorithm, arxiv preprint arxiv, dgi dmon grace mvgrl selfgnn, (11 more...)

2305.06026

Country:

North America > United States > Texas (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Nepal (0.04)

Genre: Research Report (0.40)

Industry:

Government > Regional Government (0.46)
Health & Medicine > Pharmaceuticals & Biotechnology (0.34)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
(3 more...)
